INTRODUCTION


The Center for Seismic Studies has been under
development for several years under a project directed by the
Defense Advanced Research Projects Agency (DARPA). Development
of the physical and software facilities of the Center, and the
research to be accomplished at the Center using these
facilities, are part of DARPA's program to develop techniques
that can be used to improve national capabilities to verify
compliance with potential nuclear test ban treaties.

During  an  early  phase  of  the  work,  a  system  for 
receiving,  preprocessing,  storing  and  retrieving  seismic  data 
was  designed  and  subjected  to  proof-of-principle  testing.  The 
primary  technical  challenges  were  associated  with  the  coherent 
management  of  large  volumes  of  data  received  in  numerous 
different  formats,  arriving  by  several  different  means  ranging 
from  mailed  tapes  to  instantaneous  electrical  transmission,  and 
including  both  digital  waveform  and  alphanumeric  data. 

The following phase included establishing a facility,
constructing a prototype system to test the design concepts,
and developing selected data bases to support seismic research.
A test of the ability of the prototype data center to handle
high-data-rate information was provided by receiving on-line
digital waveform data from five North American seismic stations
operated under auspices of the Department of Energy (the
so-called Regional Seismic Test Network, RSTN). Similarly,
arrangements were made to receive alphanumeric data from the
National Earthquake Information Service (NEIS), from selected
stations operated by Canada and the United Kingdom, and from
other sources. This phase also included the development of
software for preprocessing the incoming data and for performing
elementary scientific analyses of the data. Most of this work
has progressed to the point that all essential capabilities can
be demonstrated, but engineering or software problems remain
throughout the system.

The  goals  of  the  current  phase  are  to  complete  the 
development,  test  and  evaluation  of  the  prototype  data  center, 
and  to  establish  the  center  as  a  focal  point  for  research  and 
support  to  the  DARPA  nuclear  test  verification  program.  This 
entails  establishing  a  management  structure  to  guide  the 
Center's  activities,  a  small  resident  research  staff,  and  a 
developmental  staff.  Principal  developmental  tasks  include 
completion  of  the  hardware  and  software  systems  associated  with 
routine  data  acquisition,  evaluation  and  documentation  of  the 
performance  of  these  systems  in  an  essentially  automatic  mode, 
and the development of user-friendly software for seismic
analysis.  Principal  research  tasks  include  application  of  the 
Center's  data  and  computing  resources  to  the  development  and 
testing  of  seismic  analysis  methods,  designing  special  research 
data  bases,  and  conducting  studies  and  evaluations  of  U.S. 
seismological  capabilities. 

This  report  describes  technical  activities  directed 
toward these goals during the first quarter of FY 1984.

ACCOMPLISHMENTS 

Work  during  the  first  quarter  concentrated  on  the 
establishment  of  priorities  for  further  developmental  activities 
and  the  initiation  of  work  for  advancing  Center  objectives.  A 
systematic  effort  was  undertaken  to  evaluate  the  status  of  the 
various capabilities and resources, namely:

•  computers (three VAX 11/780's and two PDP 11/44's)
and a range of peripheral devices (mainly
printers, storage devices, graphics devices, and
remote terminals);

•  computer software, including operating systems,
on-line data processing programs, data base
management software, graphical display and demo
packages, applications programs, and research
tools;

•  data acquisition systems, notably systems for
receiving telemetered digital data;

•  archived data bases; and

•  documentation of functions, procedures, and
makeup of the various system components.

This  systems  evaluation  resulted  in  the  identification 
of  several  important  problem  areas  where  further  developmental 
work  was  needed.  The  performance  of  the  data  acquisition  system 
was  found  to  be  unreliable,  and  the  documentation  for  the 
troublesome  components  (the  communications  interface  system)  was 
found  to  be  inadequate  for  timely  remedial  action.  Computer 
software  was  not  developed,  tested  and  documented  to  the  level 
that  was  expected  or  needed  for  the  Center's  participation  in  an 
upcoming  test  of  an  international  exchange  of  seismic  data  and 
determination  of  source  parameters.  Large  parts  of  the  data  on 
hand  had  not  been  archived  in  the  Center's  standard  retrievable 
data  bases.  Operations  of  the  computer-based  systems  were 
generally  found  to  be  difficult  to  master,  although,  once 
mastered,  these  systems  provide  capabilities  and  ready  access  to 
data  unrivaled  in  the  seismic  research  community. 

In  short,  considerable  effort  was  found  to  be  needed  to 
make  the  various  systems  function  reliably  and  to  provide 
documentation and user-friendly interfaces to facilitate their
use  by  researchers  unfamiliar  with  the  particular  software 
operating  at  the  Center. 

Appendix 1 summarizes the status of developments at the
Center as of December 1983 with respect to DARPA objectives for
FY 1983.


While the primary activity during the first quarter was
devoted to the evaluation of existing systems, several important
advancements  were  made,  including  specific  measures  that  were 
implemented  to  remedy  deficiencies  noted  above.  The  most 
significant  accomplishments  are  presented  below. 

a.  Steps  were  taken  to  improve  the  utilization  of  computer 
resources  at  the  Center.  A  decision  was  made  to  upgrade  the 
operating  system  to  UNIX  version  4.2,  which  contains  important 
enhancements  such  as  a  needed  facility  for  managing  allocations 
of  storage  space  on  the  computer  disks.  However,  a  companion 
decision  was  to  make  no  further  upgrades  beyond  version  4.2  for 
the foreseeable future, because several deficiencies found at the
Center  were  attributable  to  the  past  practice  of  implementing 
successive  versions  of  the  operating  system  as  they  were  issued, 
which  detracted  from  completion  of  development  work  already 
underway. 

A  review  was  conducted  of  individuals  and  organizations 
having  access  to  computers  at  the  Center.  A  new  authorization 
list  was  developed,  which  limits  access  to  those  having  research 
requirements  validated  through  DARPA  and  to  individuals 
providing  direct  support  to  Center  functions. 

The  installed  version  of  the  Ingres  data  base 
management  system  was  compared  with  RTI  Ingres  for  performing 
typical  Ingres  tasks.  Benchmark  tests,  reported  in  Appendix  2, 
show  clear  superiority  of  RTI  Ingres  over  the  installed  version; 
RTI  Ingres  executes  a  common  task  involving  the  selective 
retrieval  of  data  at  more  than  50  times  the  speed  of  the 
installed  Ingres.  This  improved  efficiency  is  fundamental  for 
accessing  data  at  the  Center,  and  therefore,  action  was 
initiated  to  acquire  and  install  the  RTI  version  of  Ingres 
without  interfering  with  data  accessibility. 


b. Planning began in December for an experiment to be
conducted during the third week of January for the purpose of
assessing the Center's capabilities and testing procedures for a
subsequent multi-national test of a systematic exchange of
seismic data and its use in the determination of source
parameters.

One  major  element  of  the  January  experiment  was  to 
determine  the  rate  at  which  seismic  data  reports  are  received 
over the Global Telecommunication System of the World
Meteorological Organization (WMO/GTS) and the earliest times at
which valid locations can be obtained for the seismic sources of
the reported signals. A second major element of the planned
experiment was to detect and analyze seismic signals from
telemetered data received at the Center from the five stations
that comprise the Regional Seismic Test Network (RSTN), and to
transmit  the  measured  signal  parameters  over  the  WMO/GTS 
network.  The  third  element  of  this  experiment  was  to  compare 
two  competing  programs  for  automatically  determining  source 
events  from  the  reported  seismic  signals. 

Procedures  for  the  January  experiment  were  designed  to 
emulate  those  anticipated  for  the  Center's  participation  in  the 
subsequent  large-scale  test,  namely,  to  emulate  the  operations 
of  a  National  and  an  International  Data  Center  as  described  by 
the  Group  of  Scientific  Experts  (Third  Report  to  the  Committee 
on  Disarmament  of  the  ad  hoc  Group  of  Scientific  Experts  to 
Consider  International  Co-operative  Measures  to  Detect  and 
Identify Seismic Events).

c.  Action  was  taken  to  repair  and  upgrade  the  Center's 
capabilities  for  analyzing  waveform  data  using  automatic  and 
semiautomatic  (analyst-assisted)  methods.  The  primary  objective 
was  to  provide  computer-assisted  capabilities  for  measuring 
signal parameters as needed in the upcoming multi-national test
of a systematic exchange of seismic data and determination of
source  parameters.  These  enhanced  capabilities  should  also 
prove  useful  as  tools  for  conducting  research  at  the  Center. 

An  installed  program  for  detecting  signals  in 
continuous  digital  data  was  found  to  be  malfunctioning  and  was 
fixed.  Additional  work  is  needed  to  extend  the  capability  for 
selectively  distinguishing  local,  regional  and  teleseismic 
signals  and  for  automatically  extracting  additional  signal 
parameters. 

The  seismic  analyst  station  at  the  Center  was  designed 
to  function  using  an  outmoded  version  of  the  operating  system 
and,  consequently,  was  only  partially  operational.  Action  was 
initiated  to  upgrade  the  system  and  make  it  operational  under 
UNIX  version  4.2. 

Preliminary  work  had  previously  been  done  to  develop  a 
remote  seismic  terminal  for  routine  measurement  of  signal 
parameters  using  a  SUN  micro-computer  with  interactive 
graphics.  Our  analysis  of  the  system  showed  that  the  terminal 
was  inherently  capable  of  measuring  complete  signal  parameters 
in  a  semiautomatic  manner,  and  could  also  provide  data  base 
access  and  computational  capabilities  useful  for  research. 
Action  was  initiated  to  complete  the  original  development  for 
routine  measurement  of  signal  parameters  and  to  provide  access 
to  the  Center's  data  bases.  Additional  work  will  be  initiated 
at  a  future  date  to  develop  research  programs  on  the  remote 
seismic  terminal. 

d. Additional accomplishments included:

•  rewriting and updating programs to convert data
to the standard data base formats;

•  assisting in the installation of a special data
base composed of close-in recordings of
explosions;


APPENDIX 1.

CENTER STATUS WITH RESPECT TO MAJOR OBJECTIVES FY 1983


1.  DATA  BASE  DESIGN 

Objective:  Complete  database  design 

Status: Completed with adequate flexibility to accommodate current needs,
including nonstandard database constructs.

Comments: No changes to the database structure (Version 2.6) are anticipated;
however, upgrades in the implementation software (RTI INGRES and 4.2 UNIX) will
necessitate database upgrades.

The 'arrival' relations are not adequate for providing Level 1 parameters for
international data exchange; a special construct will be needed.

Version  2.6  does  not  restrict  the  order  of  records  in  the  assumes  particular  data 
sequences. 

Needs: Additional user-oriented documentation is needed to describe the database
structure, including conceptual design, implementation status, and operational
guidelines which identify additional documentation and potential pitfalls.

2.  DATABASE  ARCHIVING 

Objective: Achieve "automatic"/operational capability for routine data
logging/archiving into database:

WMO, UK, YKA
RSTN
SRO (GDSN)

Status: The basic software and operating procedures have been developed for
archiving three types of data: events, arrivals and waveforms. Only NEIS events are
being routinely archived under Version 2.6. Current procedures are summarized
below.

Events - NEIS origins are being routinely archived (monthly) under Version
2.6. These origins are later updated using corresponding ISC locations
provided by the USGS. With some effort these updates could
be generated directly from ISC data.

Arrivals - NEIS, WMO, UK, YKA, CSN alphanumeric data are being routinely
archived on HUGO under Version 1.0. These data are then being
reformatted and stored in a disorganized Version 2.6 database which
includes duplicate data. Programs are currently being tested for
(1) removing duplicates and (2) renumbering the data sequences.

Waveform - RSTN data (without detection) are not being routinely archived
because of CIS/Network problems, which are currently being
addressed. SCARS tapes are being generated as backup.

Waveform - USGS Network Day tapes denoted GDSN (SRO, ASRO, DWWSN, RSTN)
are being archived (~monthly) into a pre-2.6 Version of the database.
Existing software makes this incompatibility largely
transparent to the user.
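
The two clean-up steps mentioned under Arrivals (removing duplicates and renumbering the data sequences) can be illustrated with a minimal sketch. This is not the Center's software: the field names other than 'arid', the station labels, and the choice of which fields define a duplicate are assumptions made only for illustration.

```python
def remove_duplicates(records, key_fields):
    """Keep the first record seen for each key, preserving input order."""
    seen = set()
    unique = []
    for rec in records:
        key = tuple(rec[f] for f in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def renumber(records, id_field="arid", start=1):
    """Return copies of the records with consecutive identifiers."""
    return [dict(rec, **{id_field: i}) for i, rec in enumerate(records, start)]


# Hypothetical arrival records; two of them report the same pick.
arrivals = [
    {"arid": 7, "sta": "STA1", "phase": "P", "time": 1000.0},
    {"arid": 9, "sta": "STA1", "phase": "P", "time": 1000.0},  # duplicate
    {"arid": 12, "sta": "STA2", "phase": "P", "time": 1015.5},
]

# Assumption: station, phase and time together identify an arrival.
clean = renumber(remove_duplicates(arrivals, ("sta", "phase", "time")))
for rec in clean:
    print(rec)
```

Renumbering is done on copies so that the original identifiers remain available if cross-references in other relations have to be updated afterwards.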


3. EXISTING DATA BASES

Objective: Install research databases:

Yield
Explosions
Discrimination
IDCE 1980 Continuous (1-15 October and 1-7 November)
IDCE 1982 Synthetic
Earthquakes (assumed to be waveforms from GDSN)
NEIS, ISC

Status: The majority of these databases exist and are being maintained at the
Center on either HUGO or SEISMO. Few, if any, of the databases have undergone final
editing. Future improvements include updates to Version 2.6, updates
to accommodate RTI INGRES, ISC updates, corrections to recorder characteristics, and
general removal of data glitches as they are discovered. Fragmentary information on
the characteristics and status of the most significant databases is as follows:

IDCE 1980 Continuous Data - 'IDCE' on HUGO - Contains the U.S. data for the
1-15 Oct Experiment. Non-U.S. data for the 1-15 Oct Experiment and
the 1-7 Nov data are not identifiable databases at the CSS.

IDCE 1982 Synthetic - Unidentifiable at the CSS.

Discrimination - Unofficial version on JANUS, which does not have INGRES
and is therefore not usable within the context of the database system.

YIELD - 'YIELD' on HUGO - The identifiers within this database have
inconsistencies.

Explosions - 'EXPLOSION' on HUGO - Contains essentially complete data
(unclassified) on both chemical and nuclear explosions over the time
period January 1964 through 1982, apparently including information
contained in 'YIELD' (it is undesirable to maintain both databases). Data
prior to 1964 are included for underground explosions but not for
underwater or atmospheric explosions. The database is probably
reliable and should be updated as more recent information becomes
available.

GDSN - 'GDSN76'...'GDSN82' and 'master83' on HUGO - pre-2.6 Version with
tailored access software.

NEIS(arrivals); ISC - 'pre82' and 'a3' on HUGO - ISC arrivals are not archived.

NEIS(events); ISC - 'events' on SEISMO

DEMO - 'DEMO' on HUGO

RSTN - Not currently being archived (apart from SCARS tapes and GDSN).

Needs: Additional effort should be devoted to the assessment of database status,
followed by a plan with priorities and responsibilities for updating, correcting, and
documenting databases.

4.  INFORMATION  EXTRACTION  AND  DATABASE  CONVERSION 

Objective: Complete automated algorithms conversion; implement "routine"
processing/database installation for: DP, Post-DP, AA, LOC(TG,LL)

Status: The basic software exists for performing the various operations involved in
detection (DP), arrival identification (Post-DP), automatic association (AA), and
location (LOC); however, the combination of errors in the individual software modules and
incompatibilities in the formatting/network protocol prevent routine processing of
continuous waveform data.


5.  DATABASE  ACCESS 

5.1 Objective: Complete FORTRAN interfaces into applications programs

Status: Not implemented. Currently all of the applications programs work from
"external" file structures (external to INGRES). At some time in the past,
subroutines that would access data directly from the INGRES data bases were
considered, but the resulting programs were judged likely to be slow, and the
idea was abandoned. Special programs now exist which extract data from
INGRES data bases and write special files to be used by applications programs. The
drawback of this process is that one must know the subset of data to be used by an
applications program before that program is run.


5.2  Objective:  Design/implement  user  scripts  for  common  database  queries 

Status:  Some  of  these  exist.  It  is  difficult  to  predict  a  common  set  of  possible 
queries.  Perhaps  the  emphasis  should  be  placed  on  the  documentation  of  the  current 
data  base  design  from  a  seismological  rather  than  a  computerese  point  of  view. 

5.3 Objective: Design/develop user tutorials in anticipation of research support in FY
1984

Status: Some standard system-related tutorials exist (e.g. UNIX, INGRES) but the
orientation is toward the computer professional. Even so, it takes a significant amount
of time and effort to locate relevant software subroutine libraries for many of the
peripheral devices at the Center (e.g. Megatek, Tektronix, Versatec). An effort was made to
document Center-specific software (Volume II of the S-cubed document); unfortunately,
the method emulates the UNIX manual page style, which turns out to be notoriously
unfriendly to users who are unfamiliar with UNIX. A significant amount of effort is
needed to turn the Center's orientation toward seismological usefulness.


6.  INTERNATIONAL  DATA  EXCHANGE 


Objective: Ensure Center capability to run operational international
datacenter/treaty participant

Status: To date, special procedures have been tailored to accomplish this task with
significant analyst intervention.

6.1 Task: Data acquisition

Status: The Communications Interface System (CIS, 11/44) is being used to handle
WMO, UK, etc. parameter data. The CIS software is undergoing evaluation in order to
identify the problems. The problem may be with the CIS software or the local network
software or possibly both. The acquisition of parameter data is being accomplished on
a routine basis but the acquisition of waveform data has some serious problems.

6.2 Task: Parameter extraction/MSMT

Status: Parameter extraction is taking place with certain qualifications. A FORTRAN
program was obtained from NEIS and modified for the CIS; some of the routines were
rewritten in C in order to speed up the process. Some of the parameters are
not being extracted from the incoming parameter data.

6.3  Task:  Data  exchange 

Status: Data are received at the Center; none are sent out.

6.4  Task:  Automatic  Association  (Level  1) 

Status:  A  systematic  evaluation  of  the  AA  is  in  order,  because  the  various  versions  in 
use  at  the  Center  give  conflicting  results. 

6.5 Task: Location (Level 1)

Status: There is no operational location program at the Center apart from that
contained in AA.

6.6  Task:  Bulletin  preparation/distribution 

Status: Software exists for generating bulletins; however, standards have not yet
been established. Bulletins are not produced on a routine basis. The distribution
mechanism exists.

6.7  Task:  Incorporation  of  waveforms  in  "routine"  processing 

Status:  Because  of  the  problems  associated  with  the  CIS-to-HUGO  transfer  of 
waveform  data,  it  is  not  being  used  on  a  routine  basis. 

6.8 Task: Documented procedures

Status: Documentation is lacking for much of the processing software. A
task-oriented procedure has been implemented for reducing this deficiency.


7.  RESEARCH  SUPPORT 


Objective:  Develop  plan  for  supporting  researchers 

Status:  Considerable  expertise  or  perseverance  is  currently  needed  by  researchers 
to  utilize  the  Center  facilities. 

7.1 Objective: Provide data/applications support
Status:  This  is  done  on  an  ad  hoc  basis. 

7.2  Objective:  Integrate  third  VAX;  other  equipment 
Status:  Completed. 

7.3  Objective:  Initiate  classified  processing 
Status:  This  is  routinely  performed. 

7.4  Objective:  Support  remote  researchers 

Status: Generally remote users fall into two categories:

(1) those logging in from remote sites on a continual basis, and

(2) those coming to the Center to perform specific tasks for a specific duration.

Users in the first category are fairly knowledgeable about the Center's operating
system and its resources. Their requests relate to specific subsets of data from the
Center's data bases. Users in the second category need more specialized attention
because they are new to UNIX, have tapes which they need to put on the CSS computers,
or have special communication requirements. All these functions are handled by the
Center's staff, each according to his/her expertise.
APPENDIX 2.

EVALUATION OF INGRES AT THE CENTER FOR SEISMIC STUDIES

INGRES is a relational data base management tool being used at the CSS for
seismic data bases. There are at least two versions: one that comes with the standard
UNIX distribution (so-called Berkeley INGRES), and one that has been used at CSS as a
beta test-site for Relational Technology, Inc. (RTI). During the latter part of Oct. and
early Nov., 1983, CSS personnel exercised INGRES in several different configurations on
different data bases and operating systems. The following configurations were used:

1. Berkeley 4.1A UNIX with RTI 2.0 INGRES (RI).

2. Berkeley 4.1C UNIX with Berkeley 7.10 INGRES (B1).

3. Berkeley 4.2 UNIX with Berkeley 7.10 INGRES (B2).

The reason for the last benchmark was that 4.2 UNIX 'promised' that the
input/output (I/O) execution speed would be faster under 4.2 than under earlier
versions. We found quite the opposite to be true in most circumstances.

Three  different  'typical'  tasks  were  posed  to  INGRES: 

T1.  modify wftape to cisam on date

T2.  retrieve (origin.orid, arrival.arid)
     where origin.orid=assoc.orid and
     assoc.arid=arrival.arid

T3.  replace wftape (remark="this is non blank")

The first task is similar to one that would be performed by a data base
administrator at the CSS: it orders a data base by a certain attribute, in this case date. The
second task is a query of the data base searching for certain attributes from three
relations, namely origin, assoc and arrival. This type of query may be invoked by a
user of the data base. The third task directs INGRES to replace the value of the
'remark' attribute by the specified string. This type of task may be given to INGRES by
a user or data base administrator.
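
For readers more familiar with SQL than with the INGRES query language, the shape of the second task (a join of origin and arrival through assoc) can be sketched as follows. This is only an illustration under stated assumptions: SQLite stands in for INGRES, the relations are reduced to their identifier attributes, and the rows are invented. It says nothing about the performance of either INGRES version.

```python
import sqlite3

# In-memory stand-in for the three relations named in task T2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE origin  (orid INTEGER PRIMARY KEY);
    CREATE TABLE arrival (arid INTEGER PRIMARY KEY);
    CREATE TABLE assoc   (orid INTEGER, arid INTEGER);
""")

# Invented sample rows: two origins, three arrivals, and their associations.
conn.executemany("INSERT INTO origin VALUES (?)", [(1,), (2,)])
conn.executemany("INSERT INTO arrival VALUES (?)", [(10,), (11,), (12,)])
conn.executemany("INSERT INTO assoc VALUES (?, ?)", [(1, 10), (1, 11), (2, 12)])

# SQL counterpart of the T2 qualification:
#   origin.orid = assoc.orid and assoc.arid = arrival.arid
rows = conn.execute("""
    SELECT origin.orid, arrival.arid
    FROM origin
    JOIN assoc   ON origin.orid = assoc.orid
    JOIN arrival ON assoc.arid  = arrival.arid
    ORDER BY origin.orid, arrival.arid
""").fetchall()

print(rows)  # [(1, 10), (1, 11), (2, 12)]
```

The assoc relation carries no data of its own here; it only links each origin to the arrivals attributed to it, which is why the query must touch all three relations.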

The first data base tried was GDSN76, which consists of nearly
44,000 records. Unfortunately, for this data base, T2 and T3 under the B1 configuration
had to be terminated after about 12 hours of wall-clock time. An attempt was made to
run all of these benchmarks at night so as to create a favorable environment for the
completion of the tasks. A smaller data base (EXPLOSION), which contains about 9,300
records, was selected for the rest of the benchmarks. T1 did run to completion on
GDSN76 under B1, but the results are no different from those that follow, so they are
not presented.

The statistics presented by the UNIX 'time' command were recorded. They include
the CPU execution time along with the wall-clock time and the percentage of CPU
allocated to the task. The manner in which the computer presents the two types of times
makes one or the other uninteresting, because the wall-clock time weighted by the
percentage of CPU allocated is nearly the same as the CPU execution time. The minor
variations seem to be due to the fact that over the period of the task's execution the
percentage of CPU can be different. The times reported here will be the CPU execution
time, and the user can project potential wall-clock times by considering that during
peak usage periods a task can get as little as 10% of the CPU versus late night rates of
about 80%. Other statistics examined concerned the amount of memory used and the
number of I/O operations expended during the execution of the tasks. There seems to
be no straightforward conclusion of the sort: "This task is taking longer because it is
doing more I/O and it is just waiting for the disk to spin." For future sizing
considerations, RI can consume as much as 500 MBytes of memory for query
operations. The following table summarizes the CPU execution times for the various tasks
under the examined configurations.


                      Configurations

          RI          B1          B2          Ratio(B1/RI)   Ratio(B2/RI)

T1        166.1       370.9       355.0       2.2            2.1
          2:46.1      6:10.9      5:55.0

T2        132.6       7828.0      9004.5      59             68
          2:12.6      2:10:28     2:30:04.5

T3        1475.2      1030.7      1149.0      0.7            0.8
          24:35.2     17:10.7     19:09.0


The CPU execution times are given in seconds as well as in hours:minutes:seconds
form. The common denominator for the ratios is the CPU time consumed by the RI
configuration; thus a ratio of 1 or larger means that RI runs faster by that ratio. It is
evident that RI is faster for data base administrator functions such as T1 and can
be somewhat slower for updates such as T3. The most startling fact is that for query
operations (T2) RI is faster by a factor of roughly 60 or better. That means that queries
taking a matter of minutes for RI wind up consuming hours under B1 and B2. The data base
used here is rather small by most standards, which makes Berkeley INGRES rather
prohibitive as a meaningful research tool. Since the paramount purpose of data base
management software is to make use of queries in a linguistically 'natural' manner,
waiting for the mean time between computer hardware failures to manifest itself seems
to be more of a problem than a solution.
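
The arithmetic behind the table can be checked with a short sketch. The CPU seconds below are those implied by the table's minutes:seconds entries; the function names are illustrative only. The wall-clock projection uses the relation stated earlier in this appendix, that CPU time is approximately wall-clock time weighted by the fraction of CPU allocated, and is an estimate rather than a measurement.

```python
# CPU seconds per task and configuration, as implied by the table's
# minutes:seconds entries.
CPU_SECONDS = {
    "T1": {"RI": 166.1, "B1": 370.9, "B2": 355.0},
    "T2": {"RI": 132.6, "B1": 7828.0, "B2": 9004.5},
    "T3": {"RI": 1475.2, "B1": 1030.7, "B2": 1149.0},
}


def to_clock(seconds):
    """Format seconds as m:ss.s, or h:mm:ss.s when an hour or more."""
    tenths = round(seconds * 10)  # work in integer tenths of a second
    minutes, tenths = divmod(tenths, 600)
    hours, minutes = divmod(minutes, 60)
    secs = tenths / 10
    if hours:
        return f"{hours}:{minutes:02d}:{secs:04.1f}"
    return f"{minutes}:{secs:04.1f}"


def ratio_to_ri(task, config):
    """CPU time of a configuration divided by the RI CPU time for the same task."""
    times = CPU_SECONDS[task]
    return times[config] / times["RI"]


def projected_wall_clock(cpu_seconds, cpu_share):
    """Estimate wall-clock seconds from CPU seconds and the CPU fraction (0-1]."""
    if not 0 < cpu_share <= 1:
        raise ValueError("cpu_share must be in (0, 1]")
    return cpu_seconds / cpu_share


for task in CPU_SECONDS:
    for config in ("B1", "B2"):
        print(task, config, f"{ratio_to_ri(task, config):.1f}")

# Projection for the T2 query under B1 at peak (10%) and late-night (80%) shares.
for share in (0.10, 0.80):
    wall = projected_wall_clock(CPU_SECONDS["T2"]["B1"], share)
    print(f"T2 under B1 at {share:.0%} CPU: {to_clock(wall)}")
```

A ratio above 1 favors RI, so the T2 rows show where the installed version falls furthest behind, and the projection shows how much longer the same query would take on a loaded machine.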


